{
"cells": [
{
"cell_type": "markdown",
"id": "1e468c88-42c6-4ef9-9189-3d04cdef8495",
"metadata": {},
"source": [
"# Dirichlet Distribution\n",
"\n",
"In this section, we will be showcasing the Dirichlet Distribution and using D3.js {cite:p}`bostock2011d3` to provide illustrations for concepts. "
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "ad450696-5f96-4bd9-a1f7-c1a44b5bfe20",
"metadata": {
"tags": [
"hide-cell"
]
},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"\n",
"def load_d3_in_cell_output():\n",
" display(HTML(\"\"))\n",
"get_ipython().events.register('pre_run_cell', load_d3_in_cell_output)"
]
},
{
"cell_type": "markdown",
"id": "129bb8b0-4830-4dbc-9873-e2509fb66ae7",
"metadata": {},
"source": [
"## The Chinese Restaurant Process"
]
},
{
"cell_type": "markdown",
"id": "c39b1569-bfc7-41ec-89f7-3c87a4103365",
"metadata": {},
"source": [
"In the thought problem, we will be examing a situation where a hungry person (π€) enters a restaurant and needs to choose a table (βͺ).\n",
"\n",
"This original was developed by {cite:p}`aldous1985exchangeability` and a great resource to consider is Pasupat's ({cite:p}`Pasupat_2021`).\n",
"\n",
"Here are the ground rules for this thought problem.\n",
" "
]
},
{
"cell_type": "markdown",
"id": "af48f15f-7d3b-444a-b745-8e14c895dff7",
"metadata": {},
"source": [
"## Rules for Our Thought Problem"
]
},
{
"cell_type": "markdown",
"id": "802c2fbf-1898-4c04-90a0-7ace8baedd52",
"metadata": {},
"source": [
"### 1. An Infinite Amount of Tables (βͺ)\n",
"\n",
"We are depicting five tables (βͺβͺβͺβͺβͺ), but we need to consider a situation where the number of tables is infinite. \n",
"\n",
"* βͺ = β"
]
},
{
"cell_type": "markdown",
"id": "7342c7ba-74d9-4ece-ba1b-4dc6576b827f",
"metadata": {},
"source": [
"### 2. A Hungry Person (π€) Only Two Options\n",
"\n",
"When a hungry person (π€) walks into the restaurant they have two options: \n",
" \n",
"* Either they sit a table (βͺ) with someone else (π) \n",
"* or pick a new table (βͺ) \n",
"\n",
"To simplify this, here a decision chart. "
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "05be4377-481c-4f6a-8885-54e82f716645",
"metadata": {
"tags": [
"remove-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/svg+xml": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import SVG, display\n",
"display(SVG(url='https://raw.githubusercontent.com/dudaspm/LDA_Bias_Data/main/images/startCondition.svg'))"
]
},
{
"cell_type": "markdown",
"id": "d2b89149-0e89-4dfb-9c67-513764ed4526",
"metadata": {},
"source": [
"And to further reduce this down, we will be using this:"
]
},
{
"cell_type": "markdown",
"id": "e160df99-5cce-4a39-af1d-ceeb947ea3a3",
"metadata": {},
"source": [
"### 3. Many βͺ & π, Only One Empty βͺ\n",
"\n",
"This goes with #2, but in our scenario, there is the number of tables (βͺ) with people (π), but when considering an empty table (βͺ). We will only consider *one* of the infinite number of tables (βͺ) open. Another way to consider this is either a hungry person (π€):\n",
"* sits at the *one of possible many* tables (βͺ) with someone else (π) \n",
"* *OR* they sit at the *one* new table (βͺ)"
]
},
{
"cell_type": "markdown",
"id": "33113b93-32e2-450f-a5fb-13cf335ab3e8",
"metadata": {},
"source": [
"### All Tables (βͺ) are Equal\n",
"Notice that all the tables are equal distance away. So, there is no weighting based on the distance, and each table is equally likely to be picked. "
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "ac1b59d4-c268-464a-b087-8964b9c2547a",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "782ab924-62be-4ad3-a67d-0df68a631478",
"metadata": {},
"source": [
"### Key for Thought Problem\n",
"\n",
"> π€ - hungry person\n",
"* The person who needs to find a seat at a table\n",
"\n",
"> π - person eating\n",
"* A person already at a table\n",
"\n",
"> βͺ - a possible table\n",
"* A potential seat for the hungry person to sit at\n",
"\n",
"> β« - a not possible table \n",
"* Not a potential seat for the hungry person to sit at (see Rule #3)."
]
},
{
"cell_type": "markdown",
"id": "341b4578-6590-423e-862d-426046664b41",
"metadata": {},
"source": [
"## All Solutions π₯TO THE EXTREMEπ₯"
]
},
{
"cell_type": "markdown",
"id": "387dc94a-1424-46c5-9db5-16dbaccf0e46",
"metadata": {},
"source": [
"Now that we have our ground rules let's approach this problem from, what I am calling, the extreme positions. We have not mentioned a single bit of math up to this point, but this section will contain conversations around probabilities. Here are three scenarios for our extreme positions. \n",
"\n",
"1. The Social Butterfly\n",
"2. The Gambler\n",
"3. The Long Day"
]
},
{
"cell_type": "markdown",
"id": "bec17d1f-d5ba-4a01-a740-b414c13e2f47",
"metadata": {},
"source": [
"### 1. The Social Butterfly\n",
"\n",
"The social butterfly assumes every person that enters the restaurants wants to sit at the table with the most people. "
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "8b03680d-d20e-42c6-bea6-1fbf5e658b60",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "433e57bf-1419-48fd-b152-3926a9e0f635",
"metadata": {},
"source": [
"The following person (π€) walks in and sits at the most popular table. "
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "51934260-b24a-4212-b3f7-83dbe283931b",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "9d8ff65c-4e87-4671-8b5a-cb1e0fbf7757",
"metadata": {},
"source": [
"and repeat this process for the next three customers (π€)."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "2175f40b-0e1b-454e-93da-c75686e3a859",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "ef83f290-8744-4a3c-bd23-9478abb82a8e",
"metadata": {},
"source": [
"### 2. The Gambler\n",
"\n",
"The Gambler is the person who only cares about the probabilities. Meaning, if there are two tables (βͺ), then they have a 50/50 choice, and they do not care at all about the people sitting there or not. "
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "2578cef5-bfb5-4213-afb4-f4d08524bd35",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "4f164bf4-021a-4025-898f-71795296c65b",
"metadata": {},
"source": [
"Where the probability is now $p = \\frac{1}{2}$"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "d0445c09-ef73-4f4b-9946-cb23e7b97b96",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "077009c8-bd5a-451c-9323-7c7a8d564d8a",
"metadata": {},
"source": [
"Now $p = \\frac{1}{3}$"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "2288ced5-f72d-4688-8721-da87ee0cafa5",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "ae348f82-fa58-4b29-bbd1-7b6ef9280ce4",
"metadata": {},
"source": [
"Then probability is now $p = \\frac{1}{4}$"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "46a6e553-ea93-46ee-a509-04bb92c8b79c",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "00943292-f3c7-4e93-aa62-8835c0b112a3",
"metadata": {},
"source": [
"Finally, all tables (βͺ) are probability $p = \\frac{1}{5}$"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "7dfc6ce6-f9a6-41b5-8281-876d6a04d196",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "ca29cf87-050e-444a-b988-dd377d5c202a",
"metadata": {},
"source": [
"### 3. The Long Day\n",
"\n",
"The Long Day scenario describes a situation where customers (π€) coming into the restaurant had a reeeeeeeeeeeeeeeally long day. All they want is a table (βͺ) to themselves to eat their food, pay, and go home. This scenario is the opposite of the Social Butterfly, where people are at a table (π & βͺ). They will find an empty table (βͺ).\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "ddf03ba3-0919-4019-b48a-3a2288c17664",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "5bf1052e-6817-48f1-8a6b-8b5588ab09d9",
"metadata": {},
"source": [
"With this selection, the customer (π€) will always pick the new table. "
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "e5e51927-29a5-4d52-b15c-ed7d34eab11f",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "ab9ad2e4-8c75-4fff-9ed0-cbb735bb19e9",
"metadata": {},
"source": [
"Repeat for all customers (π€)."
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "8f51be78-d665-4494-990f-3724d3f18362",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "230ae6ae-3ab8-4633-accd-e0dd4540c437",
"metadata": {},
"source": [
"## The Conclusions\n",
"\n",
"### β¨1st Conclusionβ¨\n",
"\n",
"So, let's take a look at all three of these scenario results."
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "d421d13c-b5e2-4751-b13d-7bd157053218",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "9a29dac2-de22-481a-944e-882d1d14ea70",
"metadata": {},
"source": [
"Our β¨1st Conclusionβ¨ is that for each scenario, the total probabilities (when added together) equal 1. This conclusion is our first connection to the *Dirichlet Distribution*. "
]
},
{
"cell_type": "markdown",
"id": "15369812-5388-47d2-bd4a-fe5de80e8c81",
"metadata": {},
"source": [
"```{admonition} Dirichlet Distribution Always Sum to 1\n",
":class: tip\n",
"Regardless of the number of tables (βͺ), the number of people at the tables (π), or a hungry persons' (π€) strategy. The total probability will be 1. This concept is also considered to be a *probability mass function* or PMF property. \n",
"```"
]
},
{
"cell_type": "markdown",
"id": "4e850ab8-697d-4ffd-9c02-0be4485080aa",
"metadata": {
"tags": []
},
"source": [
"### β¨2nd Conclusionβ¨\n",
"\n",
"This easiest to see with our \"The Gambler\" scenerio. "
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "7f436473-c6e6-4ede-83b7-846979cff2cf",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "f7e61d25-5eff-4691-a9cd-ab374f226abf",
"metadata": {
"tags": []
},
"source": [
"```{admonition} When All Possibility Are Equally Likely\n",
":class: tip\n",
"In situations where are all possibilities are equally likely (equally likely to sit at a table with someone else (βͺ&π) or sit at a new table (βͺ)), we can abbreviate this to a simple probablity:\n",
"\n",
"$\\frac{π}{πππππ€}$ \n",
"$ = $\n",
"$\\frac{\\text{Number of people sitting at table (βͺ&π)}}{\\text{All people (πππππ€)}}$ \n",
"$ = $\n",
"$\\frac{N_j}{N}$\n",
"\n",
"AND \n",
"\n",
"$\\frac{π€}{πππππ€}$ \n",
"$ = $\n",
"$\\frac{\\text{Number of people who can sit at a new table (βͺ)}}{\\text{All people (πππππ€)}}$ \n",
"$ = $\n",
"$\\frac{1}{N}$\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "2be6b1e5-8c63-4de7-96cf-1eaec00fc394",
"metadata": {},
"source": [
"```{admonition} To Join Or Not To Join\n",
":class: tip\n",
"As shown in the animation, there are two conditions: to join *or* not to join. Both of these take advantage of the $\\frac{N_j}{N}$ relationship, but it can be seen that as tables are filled, the more likely this could occur. Meaning... \n",
"\n",
"*To join*\n",
"\n",
"$\\frac{π}{πππππ€}$ \n",
"$ + $\n",
"$\\frac{π}{πππππ€}$ \n",
"$ + $\n",
"$\\frac{π}{πππππ€}$ \n",
"$ + $\n",
"$\\frac{π}{πππππ€}$ \n",
"$ + $\n",
"$ = $\n",
"$\\frac{N-1}{N}$ \n",
"\n",
"*Or Not To Join*\n",
"\n",
"$\\frac{π€}{πππππ€}$ \n",
"$ = $\n",
"$\\frac{1}{N}$ \n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f51f26f7-cf6c-42d1-bdb1-762a4f49fc52",
"metadata": {},
"source": [
"To expand on this idea, we need to look at more of a real situation. Where someone entering the restaurant is making a decision and not the extremes. In the visualization below, click on the tables to add people to that table. We are introducing a new concept as well. That being a probability (p). The first table, before anyone is seated, will be $\\frac{p}{p} = 1 $. Meaning, if we change this probability to any number, the math will always work out. Then as we add people to tables, we will keep expanding on this introduction of the probability. \n",
"\n",
"Remember, for each table, we are saying $\\frac{N_j}{N}$, where $N_j = \\text{Number of people sitting at table j (}βͺ_j\\text{&π)}$ and $N = \\text{All people (πππππ€)}$ = $\\frac{\\text{Number of people sitting at table (βͺ&π)}}{\\text{All people (πππππ€)}}$ \n",
"\n",
"Also, $\\frac{\\text{Number of people who can sit at a new table (βͺ)}}{\\text{All people (πππππ€)}}$ $ = \\frac{1}{N}$\n",
"\n",
"or simply\n",
"$\\frac{N_j}{N} + \\frac{1}{N} = 1$\n"
]
},
{
"cell_type": "markdown",
"id": "7130d92d-9b21-40b2-9cac-abe20e4245c3",
"metadata": {},
"source": [
"```{note} \n",
"In the next animation, click on the circles to add new people. \n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "2f99db35-11fb-498c-bdba-9cc12a23a1e7",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%html\n",
"\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "c1ae54d2-0548-4459-aa42-fb458f8d1d55",
"metadata": {},
"source": [
"This lines up perfectly with what we specified beforehand. \n",
"\n",
"$\\frac{N_0}{N+p}+\\frac{N_1}{N+p}+\\frac{N_2}{N+p}+\\frac{N_3}{N+p}+\\frac{N_4}{N+p}+\\frac{N_5}{N+p}+\\frac{N_6}{N+p}+\\frac{N_7}{N+p}+\\frac{N_8}{N+p}+\\frac{p}{N+p} = $\n",
"\n",
"$\\frac{N_0+N_1+N_2+N_3+N_4+N_5+N_6+N_7+N_8+p}{N+p} = $ where $N_0+N_1+N_2+N_3+N_4+N_5+N_6+N_7+N_8 = N$\n",
"\n",
"$\\frac{N+p}{N+p} = 1$\n",
"\n",
"or, even better\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "d68621cb-85c6-430c-8c16-fd0d5341d7a8",
"metadata": {},
"source": [
"```{admonition} The Predictive Probability\n",
":class: tip\n",
"The Dirichlet has the predicitve probability of \n",
"\n",
"$\\frac{N_j}{N+p} + \\frac{p}{N+p} = 1$\n",
"\n",
"Where, traditionally, you will see p written as alpha ($\\alpha$).\n",
"\n",
"So...\n",
"$\\frac{N_j}{N+\\alpha} + \\frac{\\alpha}{N+\\alpha} = 1$\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "94fff499-79c1-4ee4-a47e-1d1edf54a77d",
"metadata": {},
"source": [
"### π° The Rich Get Richer π°\n",
"\n",
"When dealing with the Dirichlet process, we are dealing with probabilities. As before, p or $\\alpha$, represents a probability. In this case, the person entering will more than likely want to sit at a table and, more specifically, the table with the most people. This will rarely be as extreme as \"The Social Butterfly,\" but instead be represented by the $\\frac{N_j}{N+\\alpha} + \\frac{\\alpha}{N+\\alpha} = 1$ for each table. \n",
"\n",
"In this next animation, we will simulate the same experience as above but more realistic to the Dirichlet process. "
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "4284fa30-ec71-4553-8dd4-a298e3df3c88",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"